On normalization and type checking for tree transducers
Author
Abstract
Tree transducers are an expressive formalism for reasoning about tree-structured data. Practical applications range from XSLT-like document transformations to translations of natural languages. Important problems for transducers are deciding whether two transducers are equivalent, constructing normal forms, giving semantic characterizations, and type checking, i.e., checking whether the produced outputs satisfy given structural constraints. This thesis addresses these problems for important classes of tree transducers. Constructive solutions are provided, and classes of transducers for which these algorithms run in polynomial time are identified. Equivalence testing, normalization, and semantic characterization are often solved together by means of a Myhill-Nerode theorem. Such a theorem identifies necessary and sufficient semantic properties for transformations definable by a specific class of transducers. It also implies that a unique normal form exists for those transducers and that, given a transducer, the normal form can be constructed. This immediately leads to the question: are there classes of tree transducers for which a Myhill-Nerode theorem exists? We give an affirmative answer for the class of deterministic bottom-up tree transducers. A semantic characterization of transformations definable by these transducers is presented, and it is shown that for every deterministic bottom-up tree transducer, a unique equivalent minimal transducer can be constructed. The construction is based on a sequence of normalizing transformations, which, among other things, guarantee that non-trivial output is produced as early as possible. For a deterministic bottom-up transducer where every state produces either no output or infinitely many outputs, the minimal transducer can be constructed in polynomial time.
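For readers unfamiliar with the model, the following is a minimal sketch of how a deterministic bottom-up tree transducer processes an input tree. The tuple encoding of trees, the rule format, and the names `instantiate`, `run`, and `RULES` are assumptions made for illustration; they are not taken from the thesis, and the sketch performs no normalization or minimization.

```python
# Illustrative sketch only: encoding and names are assumptions, not the thesis's notation.
# A tree is a tuple (label, child_1, ..., child_k); a leaf is (label,).
# A rule maps (input label, states of the children) to (new state, output template).
# In a template, an integer i stands for "the output produced for child i".

def instantiate(template, child_outputs):
    """Build an output tree by plugging child outputs into a template."""
    if isinstance(template, int):
        return child_outputs[template]
    label, *subs = template
    return (label, *(instantiate(s, child_outputs) for s in subs))

def run(rules, tree):
    """Process the tree bottom-up; return (state, output) or None if no rule applies."""
    label, *children = tree
    results = [run(rules, c) for c in children]
    if any(r is None for r in results):
        return None
    states = tuple(state for state, _ in results)
    outputs = [output for _, output in results]
    rule = rules.get((label, states))
    if rule is None:
        return None
    state, template = rule
    return state, instantiate(template, outputs)

# Hypothetical one-state transducer: relabel f to g and swap the two subtrees.
RULES = {
    ("a", ()): ("q", ("a",)),
    ("b", ()): ("q", ("b",)),
    ("f", ("q", "q")): ("q", ("g", 1, 0)),
}

if __name__ == "__main__":
    print(run(RULES, ("f", ("a",), ("f", ("b",), ("a",)))))
```

Because at most one rule exists for each pair of input label and child states, the run is deterministic: every input tree yields at most one output tree.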
One of the useful properties of tree walking transducers is decidability of type checking: given a transducer and input and output types, it can be checked statically whether each document adhering to the input type is necessarily transformed by the transducer into documents adhering to the output type. Here, a “type” means a regular set of trees specified by a finite-state tree automaton. In general, type checking of tree transducers is extremely expensive; already for simple top-down tree transducers it is known to be EXPTIME-complete. Are there expressive classes of tree transducers for which type checking can be performed in polynomial time? Most previous approaches are based on inverse type inference. In contrast, the approach presented here uses forward
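The type checking property defined above can be made concrete with a small sketch. Note the swapped-in technique: this is a brute-force search for counterexamples among input trees of bounded height, so it can refute type correctness but cannot establish it, and it is neither inverse type inference nor the approach of the thesis. The transformation is passed as a plain function standing in for a transducer; all names (`trees_up_to`, `accepts`, `counterexample`, `mirror`, the example types) are hypothetical.

```python
from itertools import product

# Illustrative sketch only: bounded testing, not a decision procedure.
# A tree is a tuple (label, child_1, ..., child_k); a type is a pair (delta, final)
# describing a deterministic bottom-up tree automaton.

def trees_up_to(alphabet, depth):
    """Yield every tree over a ranked alphabet {label: arity} of height <= depth."""
    if depth == 0:
        return
    smaller = list(trees_up_to(alphabet, depth - 1))
    for label, arity in alphabet.items():
        for kids in product(smaller, repeat=arity):
            yield (label, *kids)

def accepts(delta, final, tree):
    """Run the automaton bottom-up and test whether the root state is final."""
    def state(t):
        label, *kids = t
        kid_states = tuple(state(k) for k in kids)
        if None in kid_states:
            return None
        return delta.get((label, kid_states))
    return state(tree) in final

def counterexample(transform, in_type, out_type, alphabet, depth):
    """Return an input of the input type whose output violates the output type, if any."""
    for t in trees_up_to(alphabet, depth):
        if accepts(*in_type, t):
            out = transform(t)
            if out is not None and not accepts(*out_type, out):
                return t
    return None

def mirror(t):
    """Stand-in transformation: relabel f to g and swap the two subtrees."""
    label, *kids = t
    if label == "f":
        return ("g", mirror(kids[1]), mirror(kids[0]))
    return t

S = ("A", "B")  # state A: the tracked leaf is "a"; state B: it is "b"
LEAVES = {("a", ()): "A", ("b", ()): "B"}
ALPHABET = {"a": 0, "b": 0, "f": 2}
IN_LEFT_A = ({**LEAVES, **{("f", (x, y)): x for x in S for y in S}}, {"A"})
OUT_RIGHT_A = ({**LEAVES, **{("g", (x, y)): y for x in S for y in S}}, {"A"})
OUT_LEFT_A = ({**LEAVES, **{("g", (x, y)): x for x in S for y in S}}, {"A"})

if __name__ == "__main__":
    print(counterexample(mirror, IN_LEFT_A, OUT_RIGHT_A, ALPHABET, 3))
    print(counterexample(mirror, IN_LEFT_A, OUT_LEFT_A, ALPHABET, 3))
```

In this example the input type contains the trees whose leftmost leaf is `a`. Mirroring moves that leaf to the rightmost position, so no violation of `OUT_RIGHT_A` is found within the bound, while `OUT_LEFT_A` is violated by a small input.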
Similar resources
Type Checking of Tree Walking Transducers
Tree walking transducers are an expressive formalism for reasoning about XSLT-like document transformations. One of the useful properties of tree transducers is decidability of type checking: given a transducer and input and output types, it can be checked statically whether the transducer is type correct, i.e., whether each document adhering to the input type is necessarily transformed into do...
XML Type Checking for Macro Tree Transducers with Holes
Macro forest transducers (mfts) extend macro tree transducers (mtts) from ranked to unranked trees. Mfts are more powerful than mtts (operating on binary tree encodings) because they support sequence concatenation of output trees as a built-in operation. Surprisingly, inverse type inference for mfts, for a fixed output type, can be done within the same complexity as for mtts. Inverse type inferen...
Towards Practical Typechecking for Macro Tree Transducers
Macro tree transducers (mtts) are an important model that both covers many useful XML transformations and allows decidable exact typechecking. This paper reports our first step toward an implementation of an mtt typechecker with practical efficiency. Our approach is to represent an input type obtained from a backward inference as an alternating tree automaton, in a style similar to Tozawa’s X...
Normalization of Sequential Top-Down Tree-to-Word Transducers
We study normalization of deterministic sequential top-down tree-to-word transducers (stws), which capture the class of deterministic top-down nested-word to word transducers. We identify the subclass of earliest stws (estws) that yield unique normal forms when minimized. The main result of this paper is an effective normalization procedure for stws. It consists of two stages: we first convert a...
Fast: a Transducer-Based Language for Tree Manipulation
We introduce a tree manipulation language, Fast, that overcomes technical limitations of previous tree manipulation languages, such as XPath and XSLT, which do not support precise program analysis, or TTT and Tiburon, which only support trees over finite alphabets. At the heart of Fast is a combination of SMT solvers and tree transducers, enabling it to model programs whose input and output can r...
Journal title:
Volume, issue:
Pages: -
Publication date: 2011